Advocating for Automation:

Adapting Current Tools in Environmental Science through R

Hannah Podzorski

rstudio::conf(2022)

July 27, 2022

Why Automate?

Confused anime dude meme, where he's a programmer confusing a basic task with something that needs automation.

https://www.reddit.com/r/ProgrammerHumor/comments/f0ag3i/automation/

Reducing the Activation Energy

Line plot showing the progress of activation energy, or the energy needed to complete a chemical reaction.

Same plot of activation energy showing that activation energy has been reduced.

Let’s Start with the Pitch

Differences in Workflow

Reactionary Workflow

Four boxes labeled A, B, C, and D.

Four boxes labeled A, B, C, and D with an asterisk next to A.

Four boxes labeled A, B, C, and D with asterisks next to A and B.

Four boxes labeled A, B, C, and D with asterisks next to A, B, and C.

Four boxes labeled A, B, C, and D with asterisks next to A, B, C, and D.

Automated Workflow

Four boxes connected by arrows labeled A, B, C, and D.

Four boxes connected by arrows labeled A, B, C, and D with an asterisk next to A.

Four boxes connected by arrows labeled A, B, C, and D with an asterisk next to A and f(X) between each box.
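The automated workflow can be sketched in base R as a chain of functions, where each arrow (the f(X) between boxes) is one function. This is a minimal sketch only: the step names (`clean_data`, `summarize_data`, `make_report`, `run_workflow`) and the example data are hypothetical and do not come from the talk.

```r
# Hypothetical workflow A -> B -> C -> D; each arrow is a function.
# All function names and data values are illustrative assumptions.

# A: raw input, the only box a person edits by hand
raw <- data.frame(
  well   = c("MW-1", "MW-1", "MW-2", "MW-2"),
  result = c(1.2, 1.8, 0.4, NA)
)

# A -> B: drop records with a missing result
clean_data <- function(x) x[!is.na(x$result), , drop = FALSE]

# B -> C: mean result per well
summarize_data <- function(x) aggregate(result ~ well, data = x, FUN = mean)

# C -> D: one formatted report line per well
make_report <- function(x) sprintf("%s: mean = %.2f", x$well, x$result)

# The whole chain as a single call
run_workflow <- function(x) make_report(summarize_data(clean_data(x)))

report <- run_workflow(raw)

# A change to A (the asterisk) reaches D by rerunning the chain,
# with no manual edits to B, C, or D
raw$result[1] <- 2.2
report_updated <- run_workflow(raw)
```

In the reactionary workflow, the same change to A would mean editing B, C, and D by hand.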

Pros of Automation

  • Reproducibility
  • It can be simple!
  • Saves time (in the long run)
  • Less human interaction means fewer errors

Where to Start?

  • Start small: choose a task that can be automated in about the same time it takes to do by hand.
  • Meet team members where they are.

{openxlsx}

write.csv(data, "data.csv")

openxlsx::write.xlsx(data, "data.xlsx")

Example of a formatted Excel table.
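A formatted table like this one could be produced with {openxlsx} roughly as follows. This is a sketch under assumptions: the example data, sheet name, styling choices, and output path are placeholders, not the formatting used in the talk.

```r
library(openxlsx)

# Hypothetical example data; column names and values are placeholders
data <- data.frame(
  well    = c("MW-1", "MW-2", "MW-3"),
  analyte = c("Arsenic", "Arsenic", "Arsenic"),
  result  = c(1.2, 0.4, 2.7)
)

wb <- createWorkbook()
addWorksheet(wb, "Results")

# Bold, centered header row with a bottom border
header_style <- createStyle(textDecoration = "bold", halign = "center",
                            border = "bottom")
writeData(wb, sheet = "Results", x = data, headerStyle = header_style)

# Two-decimal number format on the result column (data rows only)
num_style <- createStyle(numFmt = "0.00")
addStyle(wb, sheet = "Results", style = num_style,
         rows = 2:(nrow(data) + 1), cols = 3, gridExpand = TRUE)

# Fit column widths to their contents and keep the header visible
setColWidths(wb, sheet = "Results", cols = 1:ncol(data), widths = "auto")
freezePane(wb, sheet = "Results", firstRow = TRUE)

# Written to a temporary directory here; point this at a project folder in practice
out_file <- file.path(tempdir(), "data-formatted.xlsx")
saveWorkbook(wb, out_file, overwrite = TRUE)
```

Team members still receive an ordinary .xlsx file, so nothing changes on their side.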

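{officer}

library(officer)
library(magrittr)  # provides %>%

# `plot` is an existing ggplot object; wrap it as an editable vector graphic
plot <- rvg::dml(ggobj = plot)

# Place the graphic on a new slide (position and size in inches)
pptx <- read_pptx() %>%
  add_slide() %>%
  ph_with(plot, ph_location(left = 1.3, top = 0.4, width = 8.75, height = 6.9))

print(pptx, target = "./R/Fig-Example.pptx")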

Delayed
Gratification

ProUCL

  • Statistical Software for Left-Censored Environmental Data
    • Calculates Upper Confidence Limits (UCLs)
  • Developed by the U.S. Environmental Protection Agency (EPA)
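For orientation, the simplest kind of UCL is a one-sided 95% Student's t UCL on the mean, mean + t(0.95, n - 1) * s / sqrt(n). The sketch below computes it in base R for hypothetical, fully detected concentrations. It is not a substitute for ProUCL, which offers other methods and handles left-censored (non-detect) results that this formula ignores.

```r
# Hypothetical concentrations, all detected (no censored values)
x <- c(1.2, 1.8, 0.4, 2.7, 1.1, 0.9)
n <- length(x)

# One-sided 95% Student's t upper confidence limit on the mean
ucl95 <- mean(x) + qt(0.95, df = n - 1) * sd(x) / sqrt(n)

# Cross-check against the upper bound of a one-sided t interval
ucl95_check <- t.test(x, alternative = "less", conf.level = 0.95)$conf.int[2]
```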

 

EPA logo

ProUCL Automation

ProUCL Output

Example of the output file from ProUCL

Data from California’s Groundwater Ambient Monitoring and Assessment Program (GAMA). Downloaded 2022-07-11.

Was automation the way to go?

Pros

  • Regulators are happy
  • Prevents human errors
  • Saves time

Cons

  • Not designed for longevity
  • Requires special setup

Final Thoughts

  • It’s ok to start small.
  • All skill sets welcome!
  • The ultimate goal is to free up time for more important tasks.

Questions?

Slides and code available at github.com/hannahpodzorski/advocating-for-automation

Contact Information:

hpodzorski@gsi-net.com

twitter - @hpodz

github - hannahpodzorski